Mining Discourse Treebanks with XQuery
Proceedings of the Ninth International Workshop
on Treebanks and Linguistic Theories.
Editors: Markus Dickinson, Kaili Müürisep and Marco Passarotti.
NEALT Proceedings Series, Vol. 9 (2010), 245-256.
© 2010 The editors and contributors.
Published by
Northern European Association for Language
Technology (NEALT)
http://omilia.uio.no/nealt .
Electronically published at
Tartu University Library (Estonia)
http://hdl.handle.net/10062/15891
Semantics-based Question Generation and Implementation
This paper presents a question generation system based on semantic rewriting. State-of-the-art deep linguistic parsing and generation tools are employed to convert (back and forth) between natural language sentences and their meaning representations in the form of Minimal Recursion Semantics (MRS). By carefully operating on the semantic structures, we show a principled way of generating questions without ad hoc manipulation of syntactic structures. Based on a (partial) understanding of the sentence meaning, the system generates questions that are semantically grounded and purposeful. With the support of deep linguistic grammars, the grammaticality of the generation results is warranted. Further, a specialized ranking model refines the linguistic realizations from the general-purpose generation model for the question generation task. The evaluation results from QGSTEC2010 show promising prospects for the proposed approach.
Dual-mode adaptive-SVD ghost imaging
In this paper, we present a dual-mode adaptive singular value decomposition
ghost imaging (A-SVD GI), which can be easily switched between the modes of
imaging and edge detection. It can adaptively localize the foreground pixels
via a threshold selection method. Then only the foreground region is
illuminated by the singular value decomposition (SVD) - based patterns,
consequently retrieving high-quality images at lower sampling ratios. By
changing the selecting range of foreground pixels, the A-SVD GI can be switched
to the mode of edge detection to directly reveal the edge of objects, without
needing the original image. We investigate the performance of these two modes
through both numerical simulations and experiments. We also develop a
single-round scheme to halve the number of measurements in experiments, instead of
separately illuminating positive and negative patterns as in traditional methods.
The binarized SVD patterns, generated by the spatial dithering method, are
modulated by a digital micromirror device (DMD) to speed up the data
acquisition. This dual-mode A-SVD GI can be used in various applications,
such as remote sensing or target recognition, and could be further extended to
multi-modality functional imaging/detection.
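The foreground-only illumination described in the abstract above can be illustrated with a minimal numerical sketch. It assumes that "SVD-based patterns" means taking the SVD of a random matrix and setting every singular value to one, so that the patterns form an orthonormal set; the function names, the ideal coarse estimate used for thresholding, and the toy object are all hypothetical and are not taken from the paper. The patterns contain negative values, which in an experiment would require the positive/negative splitting or the single-round scheme mentioned in the abstract.

```python
import numpy as np

def svd_patterns(n_pixels, rng):
    """Return n_pixels orthonormal patterns (as rows) from the SVD of a random matrix."""
    random_matrix = rng.standard_normal((n_pixels, n_pixels))
    u, _, vt = np.linalg.svd(random_matrix)
    # Replacing all singular values by 1 gives U @ Vt, an orthogonal matrix.
    return u @ vt

def foreground_ghost_image(obj, coarse_estimate, threshold, rng):
    """Illuminate only the thresholded foreground pixels and reconstruct them."""
    mask = coarse_estimate > threshold            # assumed foreground selection rule
    fg_index = np.flatnonzero(mask)
    patterns = svd_patterns(fg_index.size, rng)   # one pattern per foreground pixel
    # Single-pixel (bucket) detector: one scalar measurement per pattern.
    bucket = patterns @ obj.ravel()[fg_index]
    recon = np.zeros(obj.size)
    # Rows are orthonormal, so the transpose inverts the measurement.
    recon[fg_index] = patterns.T @ bucket
    return recon.reshape(obj.shape), fg_index.size

rng = np.random.default_rng(0)
obj = np.zeros((8, 8))
obj[2:6, 3:5] = 1.0                               # toy object: a small bright rectangle
recon, n_meas = foreground_ghost_image(obj, obj, 0.5, rng)
```

In this toy case only the foreground pixels are measured, rather than all 64 pixels of the scene, which is the source of the reduced sampling ratio.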
Mining Implicit Relevance Feedback from User Behavior for Web Question Answering
Training and refreshing a web-scale Question Answering (QA) system for a
multi-lingual commercial search engine often requires a huge amount of training
examples. One principled idea is to mine implicit relevance feedback from user
behavior recorded in search engine logs. All previous work on mining implicit
relevance feedback targets the relevance of web documents rather than passages.
Due to several unique characteristics of QA tasks, the existing user behavior
models for web documents cannot be applied to infer passage relevance. In this
paper, we present the first study exploring the correlation between user behavior
and passage relevance, and propose a novel approach for mining training data
for Web QA. We conduct extensive experiments on four test datasets and the
results show our approach significantly improves the accuracy of passage
ranking without extra human-labeled data. In practice, this work has proved
effective in substantially reducing the human labeling cost for the QA service in
a global commercial search engine, especially for low-resource languages.
Our techniques have been deployed in multi-language services.
Comment: Accepted by KDD 202
Learning to Rank Question Answer Pairs with Holographic Dual LSTM Architecture
We describe a new deep learning architecture for learning to rank question
answer pairs. Our approach extends the long short-term memory (LSTM) network
with holographic composition to model the relationship between question and
answer representations. As opposed to the neural tensor layer that has been
adopted recently, the holographic composition provides the benefits of a scalable
and rich representation learning approach without incurring huge parameter
costs. Overall, we present Holographic Dual LSTM (HD-LSTM), a unified
architecture for both deep sentence modeling and semantic matching.
Essentially, our model is trained end-to-end whereby the parameters of the LSTM
are optimized in a way that best explains the correlation between question and
answer representations. In addition, our proposed deep learning architecture
requires no extensive feature engineering. Via extensive experiments, we show
that HD-LSTM outperforms many other neural architectures on two popular
benchmark QA datasets. Empirical studies confirm the effectiveness of
holographic composition over the neural tensor layer.
Comment: SIGIR 2017 Full Paper
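To make the parameter-cost argument in the abstract above concrete, here is a minimal sketch. It assumes that "holographic composition" refers to circular correlation, as used in holographic embeddings; the function names are hypothetical and the toy vectors stand in for the LSTM outputs for the question and the answer. Circular correlation has no trainable parameters, whereas the bilinear tensor of a neural tensor layer with k slices over d-dimensional inputs has d * d * k.

```python
def circular_correlation(q, a):
    """Compose two equal-length vectors: out[k] = sum_i q[i] * a[(i + k) mod d]."""
    d = len(q)
    if len(a) != d:
        raise ValueError("question and answer vectors must have the same length")
    return [sum(q[i] * a[(i + k) % d] for i in range(d)) for k in range(d)]

def ntn_bilinear_params(d, k):
    """Number of weights in the d x d x k bilinear tensor of a neural tensor layer."""
    return d * d * k

# Toy question and answer representations (stand-ins for LSTM outputs).
question_vec = [1.0, 2.0, 3.0]
answer_vec = [4.0, 5.0, 6.0]
composed = circular_correlation(question_vec, answer_vec)
```

The composed vector has the same dimensionality as its inputs, so whatever layer follows it does not grow quadratically with the representation size.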
Quantitative and dark field ghost imaging with ultraviolet light
Ultraviolet (UV) imaging enables a diverse array of applications, such as
material composition analysis, biological fluorescence imaging, and detecting
defects in semiconductor manufacturing. However, scientific-grade UV cameras
with high quantum efficiency are expensive and include a complex thermoelectric
cooling system. Here, we demonstrate a UV computational ghost imaging (UV-CGI)
method to provide a cost-effective UV imaging and detection strategy. By
applying spatial-temporal illumination patterns and using a 325 nm laser
source, a single-pixel detector is enough to reconstruct the images of objects.
To demonstrate its capability for quantitative detection, we use UV-CGI to
distinguish four UV-sensitive sunscreen areas with different densities on a
sample. Furthermore, we demonstrate dark field UV-CGI in both transmission and
reflection schemes. By only collecting the scattered light from objects, we can
detect the edges of pure phase objects and small scratches on a compact disc.
Our results showcase a feasible low-cost solution for non-destructive UV
imaging and detection. By combining it with other imaging techniques, such as
hyperspectral imaging or time-resolved imaging, a compact and versatile UV
computational imaging platform may be realized for future applications.
Comment: 9 pages, 5 figures
Question Generation with Minimal Recursion Semantics
Question Generation (QG) is the task of generating reasonable questions from a text. It is a relatively new research topic with potential uses in intelligent tutoring systems and closed-domain question answering systems. Current approaches include template-based and syntax-based methods. This thesis proposes a novel approach based entirely on semantics.
Minimal Recursion Semantics (MRS) is a meta-level semantic representation with emphasis on scope underspecification. With the English Resource Grammar and various tools from the DELPH-IN community, a natural language sentence can be interpreted as an MRS structure by parsing, and an MRS structure can be realized as a natural language sentence through generation.
There are three issues emerging from semantics-based QG: (1) sentence simplification for complex sentences, (2) question transformation for declarative sentences, and (3) generation ranking. Three solutions are also proposed: (1) MRS decomposition through a Connected Dependency MRS Graph, (2) MRS transformation from declarative sentences to interrogative sentences, and (3) question ranking by simple language models atop a MaxEnt-based model.
The evaluation is conducted in the context of the Question Generation Shared Task and Generation Challenge 2010. The performance of the proposed method is compared against other syntax-based and rule-based systems. The results also reveal the challenges of current research on question generation and indicate directions for future work.
Feature-Driven Question Answering with Natural Language Alignment
Question Answering (QA) is the task of automatically generating answers to natural language questions from humans, serving as one of the primary research areas in natural language human-computer interaction. This dissertation focuses on English fact-seeking (factoid) QA, for instance: when was Johns Hopkins founded? (January 22, 1876).
The key challenge in QA is the generation and recognition of indicative signals for answer patterns. In this dissertation I propose the idea of feature-driven QA, a machine learning framework that automatically produces rich features from linguistic annotations of answer fragments and encodes them in compact log-linear models. These features are further enhanced by tightly coupling the question and answer snippets via monolingual alignment. In this work monolingual alignment helps question answering in two ways: aligning semantically similar words in QA sentence pairs (with the ability to recognize paraphrases and entailment) and aligning natural language words with knowledge base relations (via web-scale data mining). With the help of modern search engines, databases, and machine learning tools, the proposed method is able to efficiently search through billions of facts in the web space and optimize over millions of linguistic signals in the feature space.
QA is often modeled as a pipeline of the form:
question (input) ->
information retrieval (“search”) ->
answer extraction (from either text or knowledge base) ->
answer (output).
This dissertation demonstrates the feature-driven approach applied throughout the QA pipeline: the search front end with structured information retrieval, and the answer extraction back end drawing on both unstructured data sources (free text) and structured data sources (knowledge bases). Error propagation in natural language processing (NLP) pipelines is contained and minimized. The final system achieves state-of-the-art performance in several NLP tasks, including answer sentence ranking and answer extraction on one QA dataset, monolingual alignment on two annotated datasets, and question answering from Freebase with web queries. This dissertation shows the capability of a feature-driven framework to serve as the statistical backbone of modern question answering systems.
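The question -> search -> answer extraction -> answer pipeline outlined above can be illustrated with a deliberately simple sketch. Everything in it is a toy stand-in and an assumption: word-overlap retrieval and a date regular expression replace the dissertation's feature-driven retrieval and extraction components, and the function names and two-sentence corpus are hypothetical. Only the example question and its answer come from the abstract.

```python
import re

def tokens(text):
    """Lowercased word set; punctuation is ignored."""
    return set(re.findall(r"\w+", text.lower()))

def retrieve(question, corpus):
    """Toy 'search' stage: return the sentence sharing the most words with the question."""
    question_words = tokens(question)
    return max(corpus, key=lambda sentence: len(question_words & tokens(sentence)))

def extract_answer(sentence):
    """Toy extraction stage: return the first 'Month D, YYYY' date, or None."""
    match = re.search(r"[A-Z][a-z]+ \d{1,2}, \d{4}", sentence)
    return match.group(0) if match else None

def answer_question(question, corpus):
    """Chain the stages: question -> retrieval -> extraction -> answer."""
    return extract_answer(retrieve(question, corpus))

# Hypothetical two-sentence corpus for illustration only.
corpus = [
    "Baltimore is a city in Maryland.",
    "Johns Hopkins was founded on January 22, 1876.",
]
answer = answer_question("when was Johns Hopkins founded?", corpus)
```

Because each stage consumes the previous stage's output, a retrieval mistake cannot be repaired by the extractor, which is the kind of error propagation the abstract says the feature-driven approach aims to contain.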